GPGPU Processing in CUDA Architecture

Authors

  • Jayshree Ghorpade
  • Jitendra Parande
  • Madhura Kulkarni
  • Amit Bawaskar
Abstract

The future of computation is the Graphics Processing Unit (GPU). Given the promise that graphics cards have shown in image processing and accelerated rendering of 3D scenes, and the computational capability that these GPUs possess, they are developing into powerful parallel computing units. It is quite simple to program a graphics processor to perform general parallel tasks, and once its various architectural aspects are understood, it can be used for other demanding tasks as well. In this paper, we show how CUDA can fully utilize the tremendous power of these GPUs. CUDA is NVIDIA's parallel computing architecture; it enables dramatic increases in computing performance by harnessing the power of the GPU. This paper describes CUDA and its architecture and compares CUDA C/C++ with other parallel programming languages such as OpenCL and DirectCompute. It also lists common myths about CUDA and explains why the future seems promising for CUDA.

Similar resources

Spoc: GPGPU Programming through Stream Processing with OCaml

ions Skeletons and Composition : Tomorrow 4:30pm OpenGPU workshop DSL Embedded language to express kernel Real World Use Case 2DRMP : Dimensional R-matrix propagation (Computer Physics Communications) Simulates electron scattering from H-like atoms and ions at intermediate energies Multi-Architecture: MultiCore, GPGPU, Clusters, GPU Clusters Translate from Fortran + Cuda to OCaml+SPOC + Cuda/Op...

Parallel processing for SAR image generation in CUDA – GPGPU platform

High resolution imagery from synthetic aperture radar (SAR) video data requires numerical computations of the order of gigaflops (GFLOP). The computational burden increases with the image size and the amount of input raw video signals. General purpose graphic processor units (GPGPU) can play a pivotal role in parallel processing the raw video data to generate SAR imagery in a much faster proces...

Barra, a Parallel Functional GPGPU Simulator

We present a GPU functional simulator targeting GPGPU based on the UNISIM framework which takes unaltered NVIDIA CUDA executables as input. It simulates the native instruction set of the Tesla architecture at the functional level and generates detailed execution statistics. Simulation speed is competitive with the less-accurate CUDA emulation mode thanks to optimizations which exploit the inher...

Soft GPGPUs for Embedded FPGAs: An Architectural Evaluation

We present a customizable soft architecture which allows for the execution of GPGPU code on an FPGA without the need to recompile the design. Issues related to scaling the overlay architecture to multiple GPGPU multiprocessors are considered along with application-class architectural optimizations. The overlay architecture is optimized for FPGA implementation to support efficient use of embedde...

Performance Comparison of Asynchronous Transfer Configurations for UHD Game Image Compression with GPGPU

Ultra high definition (UHD) game scenes have caused the memory bandwidth problem. The lossless DPCM-GR based compression algorithm [12] using NVIDIA CUDA (Compute Unified Device Architecture) like general purpose GPU (GPGPU) computing relieves the bandwidth problem without sacrificing image quality, which supports bit parallel pipelining. This paper increases the memory bandwidth efficiency usin...

Adaptable and Efficient Variable Size Template Matching in CUDA

Introduction Increasingly flexible GPUs and the advent of GPGPU (General Purpose GPU) languages, such as Nvidia’s CUDA and the OpenCL standard, offer potential peak performance that far exceeds that of general purpose CPUs for a variety of problems. However, architectural and programming restrictions often prevent programmers from achieving peak performance. Even for problems that map well to c...

Journal title:
  • CoRR

Volume abs/1202.4347

Publication date 2012